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THE  BOX  METHOD  FOR  LINEAR  PROGRAMMING: 
PART  1  -  BASIC  THEORY 

by  Karel  ZIKAN  and  Richard  W.  COTTLE 


1.  Introduction* 

Interior-point  linear  programming  algorithms  have  been  described  as  methods  in  which 
“one  plunges  into  the  interior  of  the  feasible  region”  [Kolata  (1984)]  instead  of  “crawling 
along  its  edges”  [Brown  and  Koopmans  (1951)].  One  way  or  another,  ellipsoids  play  an 
important  role  in  both  well  known  interior-point  algorithms  for  linear  programming.  Their 
role  is  quite  conspicuous  in  the  so-called  ellipsoid  method  of  Hacijan  (1979),  whereas  their 
role  is  more  subtle  in  the  projective  scaling  algorithm  of  Karmarkar  (1984a),  (1984b)  and 
its  affine  scaling  variants,  e.g.,  Adler,  Resende,  and  Veiga  (1986),  Barnes  (1986),  Vanderbei, 
Meketon,  and  Freedman  (1986).  Indeed,  Adler,  Resende,  and  Veiga  (op.cit.)  note  solving 
the  direction-finding  subproblem  is  tantamount  to  optimizing  the  main  problem’s  linear 
objective  function  over  an  ellipsoid.  It  is  generally  believed  that  the  direction-finding 
subproblem  can  be  a  heavy  computational  burden  in  an  otherwise  very  efficient  method. 
The  work  reported  here  was  inspired  by  efforts  to  reduce  this  computational  burden. 

Unlike  the  aforementioned  interior-point  algorithms,  the  one  presented  in  this  paper 
does  not  explicitly  or  implicitly  use  ellipsoids  in  its  direction-finding  scheme.  Instead,  this 
algorithm  uses  “boxes”,  or,  more  precisely,  parallelepipeds.1  Each  such  box  is  associated 
with  a  basic  solution  of  the  constraints  of  the  problem.  The  computation  of  each  search 
direction  is  done  over  the  current  box.  These  subproblems  are  linear  programs  having  closed 
form  solutions.  Accordingly,  this  interior-point  linear  programming  algorithm  makes  no  use 
of  nonlinear  programming. 

The  algorithm  presented  here  works  with  linear  programs  whose  constraints  are  ex¬ 
pressed  as  linear  inequalities.  It  is  initiated  at  an  interior  feasible  point  which  is  assumed 
to  be  available.  A  way  to  obtain  such  a  point,  if  one  exists,  is  discussed  in  Adler,  Resende, 
and  Veiga  (1986)  and  Murty  (1986).  The  problem  formulation, ‘relevant  assumptions,  and 
motivation  for  the  algorithm  axe  set  forth  in  Section  2.  The  algorithm  is  stated  in  Section  3 
and  shown  to  converge  in  Section  4.  As  a  by-product  of  the  methodology  introduced  here, 
one  obtains  a  finite  subdivision  of  Rn;  it  is  presented  in  Section  5.  This  much  will  complete 
the  basic  theory  of  the  box  method  for  general  linear  programs. 

lIn  light  of  how  difficult  it  is  to  pronounce  this  word,  to  say  nothing  of  spelling  it,  we  hesitate  to  propose 
the  name  “parallelepiped  algorithm.” 
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A  specialization  of  the  algorithm  to  minimal  cost  network  flows  and  in  particular  to 
the  dual  of  the  standard  transportation  problem,  as  well  as  implementation  issues  and 
computational  experience  for  the  general  linear  programming  case  will  be  discussed  in 
subsequent  reports. 

2.  Formulation,  Assumptions,  and  Motivation, 

The  algorithm  described  in  this  paper  is  designed  for  linear  programs  of  the  form 

maximize  cTx 
subject  to  Ax  <  b 

Such  a  problem  is  dual  to  one  of  the  (so-called)  standard  form 

minimize  (2.1) 

subject  to  A^y  =  c  (2.2) 

y  >  0  (2.3) 

Accordingly,  the  algorithm  may  (but  need  not)  be  thought  of  as  a  “dual  method.”  Let 

X  =  {x  :  Ax  <  b }.  (3) 

Regarding  the  data  of  (1),  we  make  the  following  assumptions: 

(Al)  Ae  Rm*n,be  Rm,ce  Rn,m>rv, 

(A2)  rank(A)  =  n; 

(A3)  ||A,-.||2  =  1  t  =  l,...,m; 

(A4)  3x°  €  iT*  such  that  Ax°  <  b; 

(A5)  {i6X:  cTx  >  cTx0}  is  bounded. 

Assumption  (A2)  is  not  absolutely  necessary,  but  it  greatly  simplifies  the  exposition  of  the 
method.  Assumption  (A3),  which  says  that  the  rows  of  A  have  unit  length  in  the  Euclidean 
norm,  can  be  made  to  hold  unless  A  has  a  row  of  zeros,  which  would  be  quite  abnormal. 
Assumption  (A4),  while  not  automatic,  is  not  unusual,  either.  Initialization  schemes  have 
been  proposed  in  the  literature.  This  assumption  implies  that  x°  is  an  interior  point  of  X. 
The  inclusion  of  assumption  (A5)  is  undesirable,  but  not  unusual. 

For  any  vector  x  €  X,  define  the  nonnegative  (slack)  vector 


(1.1) 

(1.2) 


tt>(x)  =  b  —  Ax. 
2 


(4) 


In  light  of  the  normalization  assumption  (A3),  the  components  of  uij(x)  can  be  interpreted 
as  the  Euclidean  distances  to  the  corresponding  hyperplanes 

n 

Hi  =  {x  *  ^  =  (5) 

When  x  G  intX,  these  distances  are  all  positive.  For  such  a  vector,  one  can  define  a 
diagonal  matrix 


D  =  Diag(l/uq, . . . ,  l/u;m)  (6) 

where  w  =  u>(x). 

The  key  direction-finding  subproblem  in  a  standard  variant  of  Karmarkar’s  algorithm 
entails  solving  the  equation 


ATD2Adx  =  c,  (7) 

and  then  taking 

dw  =  -Adx.  (8) 

The  search  direction  dx  given  by  the  solution  to  (7)  can  be  interpreted  as  the  solution  to 

the  problem  of  maximizing  the  linear  form  cTz  over  the  ellipsoid 

Z  —  {z  \  zTArD2Az  <  k}  (9) 

(where  k  is  a  positive  constant).  One  can  think  of  this  ellipsoid  as  being  centered  at  a 
current  iterate  and  contained  within  the  polyhedron  X .  This  leads  to  the  notion  that  the 
direction  given  by  a  suitable  approximation  to  the  aforementioned  ellipsoid  may  produce  a 
direction  vector  that  is  cheaper  to  compute  yet  nearly  as  good  as  the  one  given  by  (7).  This 
is  where  the  interpretation  of  the  slack  variables  as  distances  to  the  bounding  hyperplanes 
becomes  important.  It  is  at  least  intuitively  clear  that  the  shape  of  an  ellipsoid  determined 
by  an  interior  point  close  to  a  vertex  will  not  be  strongly  influenced  by  the  more  remote 
bounding  hyperplanes,  i.e.,  those  for  which  wt  is  large.  This  suggests  trying  to  use  a  basis 
B  in  A  that  corresponds  to  a  set  of  “nearby”  hyperplanes. 

To  find  such  a  matrix,  consider  the  distances  (slack  variables)  wx  =  u>,(x)  corresponding 
to  the  interior  point  x.  There  will  exist  an  index  set  E  of  cardinality  n  such  that  the  vectors 
{A,.  :  *  6  E}  are  linearly  independent  and  ls  minimum.  A  matrix  of  the  type  Ag. 

(the  matrix  consisting  of  the  rows  Al#  for  all  i  €  E)  will  be  called  a  minimum- weight  basis 
corresponding  to  x.  For  a  given  point  x  there  may  be  several  such  sets  E.  Each  of  these 


3 


sets  determines  a  unique  point  x(E)  which  may  or  may  not  belong  to  X.  If  x(E)  belongs 
to  Xy  then  it  must  be  an  extreme  point. 

Instead  of  using  an  ellipsoid  Z  as  in  (9)  or  even  an  ellipsoidal  approximation  to  Z  for 
calculating  a  search  direction  at  x,  consider  the  parallelepiped 

P(E )  =  {z  :  -we  <  Ae.z  <  u>e}-  (10) 

Likewise,  we  is  the  subvector  of  w  corresponding  to  E .  For  brevity,  denote  the  n  x  n 
nonsingular  matrix  As.  by  B . 

The  “box  problem”  corresponding  to  x  is 


maximize  c'z 

subject  to  Bz  <  we 
—Bz  <  we 

Its  feasible  region  is  just  the  set  P(E )  defined  in  (10). 

To  see  that  the  box  problem  (11)  can  be  solved  in  closed  form,  let 

T  _  ;Tn 
c  =  c  a. 


(n.i) 

(11.2) 

(11.3) 


z  —  Bz. 


Then  (11)  can  be  written  as 


maximize  cxz 
subject  to  z  <  W£ 

~Z  <  W£ 


(14.1) 

(14.2) 
(14.3' 


To  form  (14.1)  one  needs  cT  which  can  be  gotten  from  (12)  through  an  L-U  factorization 
of  B.  Suppose 


Z  —  {m  *  •  •  •  i  *»}• 


For  j  =  1,. . .  ,n,  let 


ij  —  wi}  if  cj  >  0, 
ij  =  -wtj  if  Cj  <  0. 


i 


(16.1) 

(16.2) 


It  is  clear  that  the  vector  z  defined  by  (16)  is  optimal  for  (14).  The  same  factorization  of 
the  matrix  B  can  be  used  in  (13)  to  construct  an  optimal  solution  of  (11)  from  that  of  (14). 
Thus,  the  box  problem  has  a  closed  form  solution. 

For  each  t  E  Ey  the  set  P(E )  has  two  sides  (facets)  parallel  to  the  hyperplane  Hi  given 
by  (5).  The  vertex  of  P(E)  determined  by  z  =  B~lwE  is  analogous  to  the  point  x(E) 
mentioned  above.  In  fact,  x(E)  =  x  +  z.  The  hyperplanes  that  determine  z  are  parallel  to 
(a  subset  of)  those  incident  to  x(E).  Thus,  the  set  x  +  P(E)  fits  “neatly”  into  the  comer 
at  x(E ).  If  x(E)  is  nondegenerate,  the  fit  is  exact.  If  x(.E)  E  X,  and  z  =  B~lwE  solves  the 
box  problem  at  x,  then  moving  from  x  in  the  direction  z  with  a  step  length  of  1  yields  an 
optimal  solution  of  the  original  problem  (1). 

3.  Statement  of  the  Algorithm* 

This  section  presents  the  algorithm  in  its  general  form  for  linear  programming  purposes. 
Specializations  will  be  treated  in  later  reports. 

The  input  to  the  algorithm  consists  of  a  linear  program  in  the  form  (1)  with  data 
satisfying  (A1)-(A5),  and  a  real  number  9  such  that  0  <  0  <  1. 

Algorithm  Is  The  Box  Method 

Step  0.  Determine  x°  E  int  X.  Set  k  =  0. 

Step  1*  Given  xk  E  intX  compute  wk  :=  6  —  Axk. 

Step  2,  Determine  a  minimum-weight  basis  AEm  corresponding  to  x*. 

Step  3*  Compute  z*,  the  solution  of  (11),  via  (12),  (13),  (15),  and  (16). 


Step  4. 


Step  5. 


Define: 

qk  :=  Azk\ 

Pk  :=  min  {wkJq^  :  qk  >  0,  i  =  1, . . . ,  m}; 
fffc  :=  8pk- 


(17) 

(18) 
(19) 


If  Pk  —  1  and  cT  =  cTAgl.  >  0,  stop.  The  vector  x*+I  =  x*  4-  2*  is  optimal. 
Otherwise,  define 

xfc+1  =  x*  +  <7*2*.  '  (20) 


If  the  termination  criterion  is  satisfied,  stop;  construct  an  optimal  solution. 
Otherwise,  replace  k  by  k  4- 1  and  return  to  Step  1. 


Comment*  on  Algorithm  I. 

1.  Transversal  step.  The  initial  interior  point  x°  may  be  deemed  “too  far”  from  the 
boundary  of  X,  and  it  may  be  advantageous  to  “reinitialize”  the  algorithm  by  finding 
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another  interior  point  such  as  one  of  the  form 


x°  +  6pc  (21) 

where 

q  =  Ac  (22.1) 

p  =  min  {w°/qi  :  qi  >  0,  i  =  1, . . . ,  m}.  (22.2) 

The  point  given  by  (21)  yields  a  larger  objective  function  value  than  x°  does.  Indeed, 

cV  +  9pc)  —  cV  ss  0p||cjj2  (23) 

could  be  large.  The  transversal  step  (21)  is  analogous  to  the  opening  move  in  Murty’s 

gravitational  method  for  linear  programming.  See  Murty  (1986).  The  idea  of  moving  all 
the  way  to  the  boundary  in  the  gradient  direction  dates  back  at  least  as  far  as  the  1951 
paper  of  Brown  and  Koopmans. 

2.  Finding  a  minimum  weight  basis.  The  work  to  be  done  in  Step  2  entails  a  partial 
sorting  of  the  slack  variables  w f.  In  the  most  favorable  case,  the  n  smallest  components 
of  tvk  correspond  to  linearly  independent  rows  of  A  and  thereby  furnish  the  minimum- 
weight  basis  Ae .•  When  these  n  rows  are  linearly  dependent,  something  special  must  be 
done.  A  practical  algorithm  for  executing  this  step  will  be  given  in  a  forthcoming  report  on 
implementation  details  and  computational  experience.  At  present,  it  suffices  to  note  that 
finding  a  minimum  weight  basis  is  a  matroidal  problem  solvable  by  a  greedy  algorithm. 
[See  Papadimitriou  and  Steiglitz  (1982),  pp.  280-289.] 

3.  Use  of  asymmetric  boxes.  The  parallelepiped  P(E )  defined  in  (10)  and  used  in  Step  3 
is  symmetric  about  the  origin.  It  is  not  really  necessary  to  use  a  parallelepiped  having  this 
symmetry  property.  It  seems  plausible  that  allowing  “more  room”  toward  the  interior  of  X 
could  be  advantageous  from  the  computational  standpoint.  Thus,  for  example,  one  could 
define 


PM  =  {z  :  -pwE  <  Ae.z  <  tDf:},  p  >  1  (24) 

thereby  making  P(E)  =  P\(E).  The  important  point  is  to  capture  some  of  the  “shape”  of 
the  set  formed  by  the  intersection  of  the  halfspaces  whose  bounding  hyperplanes  are  the 
H i ,  i  6  £.  The  technique  for  solving  the  box  problem  over  is  much  the  same  as  given 
in  Section  2  and  will  not  be  repeated  here.  See  Figure  1. 
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Figure  1:  Asymmetric  boxes. 

4.  Step  size.  The  calculation  of  p *  in  Step  4  determines  how  far  it  is  possible  to  move  from 
x*  in  the  direction  zk  before  meeting  the  boundary.  Since  (as  will  be  shown  in  Section  4) 
the  objective  function  cTx  increases  in  the  zk  direction,  the  vector  qk  must  have  at  least 
one  positive  component,  for  otherwise  (A5)  would  be  violated.  The  parameter  9  keeps  the 
new  iterate  in  the  interior  of  the  feasible  region  X.  Like  p,  value  of  this  parameter  is  to  be 
chosen  by  the  user.  It  is  common  (in  similar  algorithms)  to  use  a  value  close  to  0.9.  The 
computational  implications  of  different  choices  of  p  and  6  will  also  be  discussed  in  future 
reports. 

5.  Optimality  criteria.  It  should  be  noted  that  in  Step  5,  the  shrinking  factor  9  is  not 
used  when  the  two  conditions  p*  =  1  and  cTA^i  >  0  obtain.  In  this  instance,  it  is  an  easy 
exercise  to  show  that  xfc+1  is  an  optimal  solution  for  (1).  Moreover,  c  is  the  vector  of  basic 
variables  in  an  optimal  basic  feasible  solution  (corresponding  to  the  basis  BT)  of  the  linear 
program  (2).  Thus,  when  thinking  of  (1)  as  the  dual  problem,  it  is  a  simple  matter  to 
reconstruct  a  solution  of  the  primal  problem  from  information  at  hand  when  a  solution  of 
the  dual  has  been  found. 

To  handle  the  situation  where  termination  must  be  induced,  Step  5  requires  a  user- 
supplied  convergence  criterion.  (See  Gill,  Murray,  and  Wright  (1981)  for  some  possibili¬ 
ties.)  Being  an  interior-point,  the  final  iterate  so  determined  can  be  only  approximately 
optimal.  Passage  to  a  “nearby”  optimal  vertex  of  X  can  be  facilitated  by  Gram-Schmidt 
orthogonalization2  followed  (if  necessary)  by  application  of  the  (dual)  simplex  algorithm  to 
the  standard  form  problem,  (2).  This  procedure  terminates  with  the  usual  pair  of  primal 
and  dual  optimal  solutions. 

JThe  vertex  gotten  this  w ay  is  necessarily  an  improvement  over  the  interior  point  from  which  it  was 
derived. 
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4.  Convergence. 

This  section  is  devoted  to  proving  the  convergence  of  Algorithm  I.  It  will  be  shown  that 
starting  from  any  interior  point,  the  algorithm  generates  a  convergent  sequence  of  iterates 
whose  limit  must  be  a  vertex  of  the  feasible  region,  X.  It  will  also  be  shown  that,  when 
this  vertex  is  nondegenerate ,  it  must  be  an  optimal  solution  of  the  linear  program  (1)  and 
moreover  must  be  reached  after  a  finite  number  iterations.  Since  each  iteration  involves 
only  finitely  many  operations,  the  algorithm  is  finite  in  this  case.  It  is  well  known  in 
the  folklore  that  the  feasible  regions  of  most  real  linear  programs  have  degenerate  vertices, 
hence  recognition  and  handling  of  this  fact  is  an  essential  part  of  a  theoretically  satisfactory 
linear  programming  algorithm.  These  aspects  of  the  algorithm  will  be  addressed  later.  As 
a  practical  matter,  the  solution  recovery  procedure  pertaining  to  Step  5  that  was  sketched 
in  the  fifth  comment  in  Section  3  could  be  employed  to  deal  with  the  case  of  termination 
at  a  suboptimal  solution. 

The  convergence  proof  will  be  developed  in  a  series  of  lemmas.  Most  of  them  will  use 
the  key  idea  of  the  constancy  of  a  certain  “significant  basis.”  The  purpose  of  the  next  two 
paragraphs  is  to  define  this  notion  and  the  related  concept  of  “significant  direction.” 

The  m  x  n  matrix  A  has  rank  n,  hence  it  has  at  least  one  n  x  n  nonsingular  submatrix. 
Any  such  matrix  is  called  a  basis  in  A ;  for  a  given  matrix,  the  set  of  all  bases  in  A  will  be 
denoted  6.  For  each  member  B  of  6,  there  is  a  corresponding  set  E  of  row  indices  such 
that  B  =  Ae .*  This  correspondence  makes  it  possible  to  identify  basis  matrices  with  basis 
index  sets,  and  to  speak  of  the  latter  as  bases — indeed,  to  speak  of  these  index  sets  as  the 
elements  of  B. 

Recall  that  for  each  x  €  X,  there  is  a  slack  vector  tc(x)  =  b  —  Ax.  Given  such  an  x,  let 
B(x )  denote  the  set  of  all  E  €  B  such  that 

y;  u\(x)  <  ^  Wi  for  all  Ef  6  B 

ieE  i€E* 

The  set  B(x)  is  clearly  nonempty.  Its  elements  are  called  significant  bases  at  x.  This  set  is 
well  defined  for  all  points  in  i?71  and,  in  particular,  all  points  of  X. 

Given  x  €  X,  let  Z(x)  be  the  set  of  all  directions  that  could  be  chosen  as  solutions  to 
the  box  problem  in  accordance  with  Algorithm  I.  That  is,  . 

Z(x)  =  |J  {z  :  z  solves  (11)  via  (12),  (13),  (15),  and  (16)} 

£€B(x) 

By  making  suitable  and  obvious  modifications  in  the  definition  of  the  box  problem,  one 
can  extend  the  definition  of  Z(x)  to  all  points  of  X.  The  set  Z(x)  contains  the  zero  vector 
if  and  only  if  x  is  a  vertex  of  X,  in  which  case  0  is  the  only  member  of  Z(x).  Thus,  when 
x  is  not  a  vertex  of  X,  the  members  of  Z(x)  are  called  significant  directions  at  x. 


The  following  lemma  deals  with  significant  bases  of  points  in  the  feasible  region  A\  but 
not  necessarily  with  Algorithm  I. 

Lemma  1.  For  every  x  6  X  there  exists  an  epsilon  neighborhood  .V(x)  such  that 

B(x)  Du{B{z)  :  z  e  N(x)nX).  (25> 

Proof.  Suppose  there  is  a  point  x  6  -V  for  which  no  such  neighborhood  exists  Then  there 
is  a  sequence  {x*}  converging  to  x  such  that  for  each  xk  there  is  a  corresponding  significant 
basis  not  belonging  to  8(f).  (These  bases  could  then  be  called  “insignificant”  at  f.)  Since 
there  are  only  finitely  many  bases  in  A  (i.e.,  B  is  finite),  there  is  a  subsequence  of  |f*} 
for  which  the  aforementioned  insignificant  basis  at  f  is  the  same  for  every  member  of  the 
subsequence.  Let  this  basis  be  E' .  Thus,  E*  £  8(f).  Let  F*  denote  any  element  of  8(f) 
Now 


T:  wt(ik)  <  ^  wt(xk)  for  all  k  (in  the  sequence),  (2G) 

•  *6  E' 

and,  because  E'  is  not  a  significant  basis  at  f, 

lim  £  w,(ik)  =  £  u\(x)  >  ^2  w>( *)•  (27) 

t  €£'  HE'  HE' 

A  contradiction  can  now  be  obtained  by  talcing  limits  in  (26)  and  combining  the  resulting 
inequality  with  the  one  in  (27).  ■ 

Remark.  At  any  iteration  k  of  Algorithm  I,  the  current  point  xk  belongs  to  the  interior  of 
X .  Likewise,  the  origin  is  interior  to  the  box  P(E ).  From  this  (and  the  natural  assumption 
that  c  is  nonzero)  it  follows  that  clzk  >  0  for  any  optimal  solution  of  the  box  problem. 
Thus,  with  xk+l  =  xk  +  a kzk ,  it  is  clear  that  Algorithm  I  has  the  strict  ascent  property 

cV+1  =  cV  +  crkcTzk  >  cV.  (28) 

By  virtue  of  (28)  and  (A5),  the  iterates  xk  lie  in  a  compact  set.  Accordingly,  the  sequence 
{z*}  must  have  at  least  one  cluster  point.  Furthermore,  since  zk  ^  0  for  all  i\  it  follows 
that  the  sequence  of  normalized  direction  vectors  zk  =  (1/11**11)**  must  have  at  least  one 
cluster  point. 

Lemma  2.  If  {x*}  — +  x  and  {z*}  — ►  z  are  convergent  (sub)sequences  of  iterates  and 
corresponding  significant  directions  generated  by  Algorithm  I,  then  z  €  Z(x ). 
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Proof*  By  taking  subsequences  if  necessary,  one  can  assume  that  the  zk  are  all  solutions 
to  box  problems  formed  by  the  same  basis,  say  E.  This  means  that  the  matrix  B  =  AEm 
does  not  change.  By  Lemma  1,  E  must  belong  to  B{x).  Thus,  the  vector  c  in  (12)  does 
not  change  either.  Since  w{xk)  — ►  u>(x),  it  now  follows  from  (16)  that  zk  — ►  z.  Thus, 

z  -  limir*  =  liin-4£.rfc  =  AE.(limzfc)  =  AE.z , 

and  in  particular  z  satisfies  (13).  It  now  follows  that  z  solves  the  box  problem  associated 
with  x  in  accordance  with  Step  3  of  Algorithm  I.  This  means  z  £  Z(x).  B 

Lemma  3.  If  xk  is  a  convergent  (su'b)sequence  of  vectors  generated  by  Algorithm  I  and 
{f>k}  is  the  corresponding  sequence  of  step-sizes,  then 

pm  =  liminfp*  >  0.  (29) 

Proof.  The  sequences  can  be  refined  to  subsequences  in  such  a  way  that  the  step-size 
sequence  converges  to  p*  and  the  significant  bases  for  the  iterates  xk  are  all  the  same,  say 
E  Let  zk  be  the  corresponding  subsequence  of  significant  directions  and  let  qk  be  defined 
as  in  (  17),  while  pk  is  defined  as  in  ( IS).  The  sequences  can  be  refined  still  further  so  that 
the  same  subscript,  say  h,  occurs  as  the  minimizer  in  (18).  That  is,  pk  =  t ok/qk  for  all  k. 
Clearly 

limu’J  >  0  =>  lim  U'kk/qk  =  ivh/qh  >  0. 

Also,  if  h  £  £\  then  qk  =  u'k  for  all  k  and  lim*  wk/qk  —  1  >  0.  Therefore  wk  — ►  0  and  h  £  E 
are  two  necessary'  conditions  for  the  subsequence  of  p *  to  converge  to  0.  The  remainder  of 
the  proof  amounts  to  showing  that  these  two  conditions  lead  to  a  contradiction. 

For  any  vector  x  €  /?n,  let  70(i)  =  {*  :  u\(i)  =  0}.  Now  suppose  wk  -*•  0  and  h  £  E. 
Since  E  is  a  basis,  the  row  vector  Ah.  is  linearly  dependent  on  the  rows  of  AE.  •  Recall  that 
lim*  x*  =  x.  It  follows  that  h  £  /o(x).  Moreover,  Ah.  is  linearly  dependent  on  the  rows  of 
A  indexed  by  E  C\  /0(  j),  for  otherwise  h  would  have  been  chosen  as  an  element  of  E.  It  can 
be  shown  that  the  index  set  E  U  {/*}  contains  a  unique  minimum  cardinality  subset  C  for 
which  the  corresponding  rows  of  A  are  linearly  dependent.  One  of  these  indices  must  be  h . 
(In  the  language  of  matroids,  such  a  set  would  be  called  a  circuit.  See  Papadimitriou  and 
Steiglitz  (1982).) 

Each  significant  direction  vector  zk  can  be  decomposed  into  the  direct  sum  of  orthogonal 
components  zk  and  zk  where  zk  lies  in  the  subspace  spanned  by  the  normals  of  the  hyper- 
planes  whose  indices  lie  in  C .  These  zk  are  the  “significant  components”  of  the  direction 
vectors  in  the  sense  that 
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( Azk)i  =r  ( Azk)i  for  all  i  E  C. 


For  each  A:  in  the  sequence,  define 


ak  =  max  twf . 

«€C  * 

This  maximum  must  be  achieved  when  i  =  h,  for  otherwise  for  some  j  E  C.  Then 

after  a  replacement  in  J?  of  j  by  h,  a  new  and  “lower- weight”  basis  Ef  would  be  obtained, 
in  contradiction  to  the  defining  property  of  E.  Thus,  ak  =  tvk. 

To  obtain  a  bound  on  ||£*||,  recall  that  zk  lies  in  the  space  spanned  by  Ac .  and 

(Azk)i  =  ( Azk\  =  zk  for  all  t  E  C  \  {h}  (30) 

where  \zk\  =  wk  (as  in  (16)).  Assume  the  cardinality  of  C  is  L  +  1.  All  the  relevant  vectors 
lie  in  an  L-dimensional  subspace.  After  an  isometric  transformation  of  the  subspace  to  Rl , 
the  system  (30)  becomes  one  of  the  form 

Muk  =  y*, 

where  ||u*||  =  ||z||  and  ||y*||  =  ||z*||.  From  (17)  and  (A3)  one  gets 

si  =  <  M».||  P‘||  =  ||f*|| ; 

it  then  follows  that 

wi  i 

Pk  =  -T>  ~r = -  >  0. 

4  ~  vT||M-i|| 

This  inequality  holds  for  all  k.  Notice  that  the  quantity  \/ >/L\\M~x\\  is  independent  of  k 
and  hence  provides  a  positive  uniform  lower  bound  on  pk.  ■ 

Lemma  4.  If  {xfc}  is  a  sequence  of  iterates  generated  by  Algorithm  I  and  x  is  a  cluster 
point  of  {x*},  then  x  is  an  extreme  point  of  X. 

Proof.  Suppose  x  is  not  an  extreme  point  of  X.  Then  each  element  z  of  Z(x)  is  a  nonzero 
ascent  direction: 


cTx  <  cT(x  +  A z)  for  all  A  >  0. 

Let  {z*}  be  the  corresponding  sequence  of  significant  directions.  By  extracting  a  suitable 
subsequence  (if  necessary)  one  can  assume  x*  — ►  x  and  zk  — ►  z.  This  is  possible  since 
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Lemma  3  implies  that  ||z*||  cannot  get  arbitrarily  large  for  x*  in  some  epsilon  neighborhood 
N(x)  of  the  point  x.  Therefore  {z*}  must  have  a  cluster  point. 

Now,  for  the  subsequence  {x*}  as  just  constructed,  consider  the  sequence  {y*}  of  their 
Algorithmic  successors,  that  is,  the  points  of  the  form 

yk  =  xk  +  ckzk. 


Also, 


liminf  cTyk  >  cT[x  +  0(liminf  pk)z]  >  cTx. 

Thus,  c^yk  >  cTx  for  almost  all  Jfc.  This  implies  >  cTx  for  almost  all  k.  It  now  follows 

that 


cV  <  cTx  <  cV^1  —  cTx*  f°r  almost  all  it, 
and  this  is  a  contradiction.  ■ 

Lemma  5.  The  sequence  of  iterates  xk  generated  by  Algorithm  I  converges  to  a  vertex  of 
X . 

Proof.  Since  the  entire  sequence  lies  in  a  compact  set,  it  must  have  at  least  one  cluster 
point.  In  Lemma  4,  it  was  shown  that  any  such  cluster  point  is  a  vertex  of  A".  Thus,  it 
remains  to  show  that  {xk}  has  only  one  cluster  point. 

Suppose  {x*}  has  two  (or  more)  cluster  points,  say  vl, .  .  ,  vp  Let  N(vl ), . . .  ,  N(vp  ) 
be  disjoint  epsilon  neighborhoods  of  the  sort  described  in  Lemma  1.  The  union  of  these 
neighborhoods  contains  all  but  a  finite  number  of  the  xk.  Since  there  are  finitely  many 
vertices  of  X,  it  follows  that  (after  relabeling  if  necessary)  for  infinitely  many  values  of 
k ,  xk  €  N(vl)  and  x**1  6  N(v2).  Let  {x/}  denote  the  first  of  these  subsequences  and  let 
{y/}  denote  the  other.  Thus,  x*  — » t;1  and  y/  — »  t;2.  Note  that  y/  =  x/  +  <?(Z*  Define 

u  =  x*  +  ptz * . 

Since  {u*}  is  a  sequence  of  feasible  points,  all  its  cluster  points  belong  to  X  (There  must 
be  at  least  one  such  point.)  Without  loss  or  generality,  one  can  assume  that  {u'}  — »  u  It 
follows  from  these  definitions  that 


After  taking  limits,  one  has 
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v2  =  (1  -0)vl  +6u.  (31) 

Clearly  u  /  vl ,  for  v2  ^  v1.  Since  0  <  0  <  1,  equation  (31)  states  that  the  extreme  point 
v2  is  a  proper  convex  combination  of  distinct  points  of  X.  This  is  a  contradiction.  ■ 

Theorem.  Given  x°  6  intX,  let  {x*}  denote  the  sequence  of  points  generated  by  Algorithm 
Iy  and  let  x  denote  the  extreme  point  of  X  to  which  the  sequence  converges .  If  x  is 
nondegeneratef  then: 

(i)  x  is  an  optimal  solution  o/(l); 

(ii)  the  algorithm  terminates  after  finitely  many  iterations. 

Proof.  Since  x  is  a  nondegenerate  vertex,  it  corresponds  to  exactly  one  basis,  E.  As 
the  iterates  approach  x,  they  enter  an  epsilon  neighborhood  N(x)  of  the  sort  for  which 
(according  to  Lemma  1)  the  inclusion  (25)  holds.  In  other  words,  the  significant  bases  of 
the  iterates  become  (and  remain)  the  same  as  the  one  for  x,  namely  E .  Thus,  if  the  vertex 
X  is  not  optimal,  the  solution  to  (12)  must  have  at  least  one  negative  component,  say 
cy  This  implies  (Azk)%  =  — u\k,  indicating  that  wt  should  continually  increase  (instead  of 
converging  to  0).  This  contradiction  completes  the  proof  of  assertion  (i). 

Assertion  (ii)  follows  from  the  fact  that  after  finitely  many  steps,  the  iterates  reach  the 
aforementioned  neighborhood  N(x)  of  the  extreme  point  x.  From  any  feasible  point  of  this 
neighborhood,  the  algorithm  will  detect  the  optimality  of  x.  In  particular,  if  xk  £  JV(x), 
then  x**1  =  x  because  pk  =  1  and  c  >  0.  ■ 

5.  Minimum-weight  subdivisions. 

It  is  the  purpose  of  this  section  to  show  that  for  any  polyhedron  X  satisfying  assumptions 
( A1 )  ( A4),  the  concept  of  a  minimum- weight  basis  (as  defined  in  Section  2)  leads  to  a  finite 
subdivision  of  X,  and  indeed  to  the  entire  space  Rn.  Concerning  subdivisions,  see  Eaves 
(1976). 

Figure  2  below  illustrates  this  assertion.  In  this  example,  there  are  five  lines,  no  two 
of  which  are  parallel.  These  lines  have  pairwise  intersections  in  ten  points,  six  of  which 
are  labeled,  tq,. . . ,  The  polyhedron  X  in  question  is  a  quadrilateral,  the  convex  hull 
of  {ui,  vj,  v3,  i;4}.  Each  vertex  of  X  is  nondegenerate,  being  the  intersection  of  exactly  two 
lines.  The  other  two  labeled  points,  t>5  and  u6  are  nondegenerate,  but  are  not  members 
of  X .  The  four  remaining  points  are  also  nondegenerate.  To  each  labeled  point  t>,  there 
corresponds  a  polyhedral  subset  of  X  labeled  t  throughout  (the  interior  of)  which  the 
points  have  the  same  significant  basis  and  hence  correspond  to  the  intersection  of  the  same 
lines,  namely  tv  These  polyhedral  subsets  (called  ce/Js)  form  a  subdivision  of  X.  The  cell 
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corresponding  to  each  vertex  of  X  contains  that  vertex  as  well  as  points  in  its  neighborhood. 
As  Figure  2  shows,  these  four  cells  do  not  fill  up  X  as  cells  5  and  6  do  not  correspond  to 
vertices  of  X . 


Figure  2:  Minimum-weight  subdivision  of  a  polyhedron. 

The  other  polyhedral  regions  of  the  plane  are  the  solution  sets  of  linear  inequality  systems 
based  on  the  same  data  with  the  sense  of  one  or  more  inequalities  reversed,  i.e.,  <  replaced 
by  >. 

Suppose  the  polyhedron  X  in  Figure  2  is  the  feasible  region  of  a  linear  program  (1).  At 
each  interior  point  of  one  of  the  six  cells  of  X  the  box  problem  formed  in  Algorithm  I  is 
the  same.  A  point  of  X  belonging  to  the  intersection  of  two  or  more  cells  has  two  or  more 
significant  bases. 


In  Figure  2,  the  unique  point  (call  it  x)  belonging  to  cells  4,  5,  and  6  illustrates  Lemma 
1.  Within  a  sufficiently  small  epsilon  neighborhood  of  this  point,  the  set  of  significant  bases 
of  every  point  z  is  a  significant  basis  of  x.  If  the  epsilon  neighborhood  were  too  large,  it 
would  contain  points  for  which  the  basis  that  gives  rise  to  v&  is  significant  but  that  basis  is 
not  significant  for  this  x. 

The  following  discussion  is  aimed  at  generalizing  these  observations.  The  goal  is  to 
show  that  the  notion  of  minimum- weight  basis  induces  a  subdivision  of  JZn.  Many  of  the 
definitions  and  notations  are  the  as  those  in  Section  4,  but  the  objective  function  of 
the  original  linear  programming  is  of  no  immediate  consequence  for  this  purpose. 

Let  X  be  the  set  defined  in  (3)  where  A  and  b  satisfy  the  assumptions  stated  in  (Al)- 
(A4).  In  principle,  the  tv  vector  in  a  solution  (x,u?)  of  the  system  equations 

Ax  +  w  =  b  (32) 

could  belong  to  any  of  2m  orthants.  Requiring  that  w  belong  to  one  of  these  orthants  is 
equivalent  to  specifying  a  linear  inequality  system 

Ai.x  o  6,  i  —  1, . . .  m 

where  o  stands  for  <  or  >.  Each  such  system  corresponds  to  a  polyhedral  set  in  Rn.  Some 
of  these  sets  may  be  empty,  but  all  the  nonempty  ones  must  have  interior  points,  as  X 
does. 

For  each  point  x  €  Rn,  it  is  still  meaningful  to  define  the  combinatorial  optimization 
problem 

COP(x)  minimize  u\(x)  :  E  6  6}.  (33) 

i€E 

Since  B  is  finite,  COP(x)  has  at  least  one  optimal  solution.  As  before,  denote  the  set  of 
these  optimal  solutions  by  B(x).  An  element  of  B(x)  is  called  a  minimum-weight  basis. 

Definition.  If  x  and  y  belong  to  Rn,  then 

x  ~  y  <=>  B(x)  =  B(y).  (34) 

Lemma  6.  The  relation  ~  defined  in  (34)  is  an  equivalence  relation. 

Proof.  The  assertion  is  obvious  because  equality  is  itself  an  equivalence.  ■ 

Remark.  The  lemma  would  still  be  valid  if  equality  in  (34)  were  replaced  by  any  equiva¬ 
lence  relation  defined  on  the  set  of  all  subsets  of  B. 
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Definition.  If  x  and  y  belong  to  JZ",  then 

y  ■<  x  B(x)  C  B(y).  (35) 

Lemma  7.  The  relation  ■<  defined  in  (35)  is  a  partial  ordering. 

Proof.  Obvious.  ■ 

Lemma  8.  For  all  x  G  Rn,  the  set 

T(x)  =  {y  :  y  ^  x)  (36) 

is  a  polyhedron . 

Proof.  It  must  be  shown  that  T(x)  is  the  solution  set  of  a  finite  system  of  weak  linear 
inequalities.  To  see  this,  note  that  a  vector  y  belongs  to  T(i)  if  and  only  if 

r  u>,(y)  <  ^2  u\(y)  for  all  E 9  G  B(x)  and  all  E  €  B 
i€E'  i£E 

The  lemma  now  readily  follows  from  the  definition  u>,(y)  =  b,  —  Al#y.  ■ 

Remark.  The  equivalence  class 

K(x)  =  {y:y~i}  (37) 

is  the  relative  interior  of  T(x). 


16 


References. 


I.  Adler,  M.G.C.  Resende,  and  G.  Veiga,  An  implementation  of  Karmarkar’s  algorithm  for 
linear  programming,  Technical  Report  ORC  86-8,  Operations  Research  Center,  Department 
of  Industrial  Engineering  and  Operations  Research,  University  of  California,  Berkeley,  May 
1986. 

E.R.  Barnes,  A  variation  on  Karmarkar’s  algorithm  for  solving  linear  programming  prob¬ 
lems,  Mathematical  Programming  36  (1986)  174-182. 

G.W.  Brown  and  T.C.  Koopmans,  Computational  suggestions  for  maximizing  a  linear  func¬ 
tion  subject  to  linear  inequalities,  in  (T.C.  Koopmans,  ed.)  Activity  analysis  of  production 
and  allocation  (John  Wiley  &  Sons,  New  York,  1951),  374-376. 

B.C.  Eaves,  A  short  course  in  solving  equations  with  PL  homotopies,  in  (R.W.  Cottle  and 
C  E.  Lemke,  eds.)  Nonlinear  Programming,  SIAM- AMS  Proceedings,  v.  9,  (American 
Mathematical  Society,  Providence,  RI,  1976)  73-143. 

P.E.  Gill,  W.  Murray,  and  M.H.  Wright,  Practical  Optimization,  (Academic  Press,  London, 
1981.) 

L.G.  Hacijan,  A  polynomial  algorithm  in  linear  programming,  Doklady  Akademii  Nauk 
SSSR  244  (1979)  191-194. 

N.  Karmarkar,  A  new  polynomial-time  algorithm  for  linear  programming,  Proceedings  of 
the  16th  Annual  ACM  Symposium  on  the  Theory  of  Computing,  (1984a)  pp.  302-311. 

N.  Karmarkar,  A  new  polynomial- time  algorithm  for  linear  programming,  Combinatorica 
4  (1984b)  373-395. 

G.  Kolata,  A  fast  way  to  solve  hard  problems,  Science  225:4668  (21  September  1984)  1379- 
1380. 

K.G.  Murty,  The  gravitational  method  for  linear  programming,  Opsearch  23  (1986)  206- 
214. 

C  H.  Papadimitriou  and  K.  Steiglitz,  Combinatorial  Optimization,  (Prentice- Hall,  Engle¬ 
wood  Cliffs,  NJ,  1982.) 

R.J.  Vanderbei,  M.S.  Meketon,  B.A.  Freedman,  A  modification  of  Karmarkar’s  linear  pro¬ 
gramming  algorithm,  manuscript,  Algorithmic  a  1  (1986)  395-408. 


17 


UNCLASSIFIED 


HCWMTV  CLASSIFICATION  Of  TIMS  PAM  (Wl» 


DOCUMENTATION  PACE 


SOL-87-6 


UrOU  COMPLCTOO  FORM 


«■  TITH  fSPA  AA«f) 

The  Box  Method  for  Linear  Programming: 
Part  I:  Basic  Theory 


S.  TVPt  OP  MPORT  A  P(NOO  COVKRKO 

Technical  Report 


A.  PCRPORMIMO  OAO.  MPOAT  HUMPS* 


Karel  Zikan  and  Richard  W.  Cottle 


N00014-85-K-0343 


Stanford  University 
Stanford,  CA  94305 


It.  CONTROLLINO  OPPICS  MAMS  AMO  AS 

Office  of  Naval  Research 
800  N.  Quincy  Street 
Arlington,  VA  22217 


Dept,  of  the  Navy 


NR-047-064 


I*.  IMPORT  DATA 

June  1987 


OM«J  IS.  MCURITV  CL  ASA.  fW  I 

UNCLASSIFIED 


This  doc  Mien t  has  been  approved  for  public  release  and  sale; 
its  distribution  Is  unlimited. 


it.  mstrioution  statsmsnt  fw  am  < 


! 


linear  programming  interior-point  algorithm 

paral lelepipeds 

box  constraints 

mi nimum-wei ght  basis 

polyhedral  subdivision 

ik  AMTEACT  fCmmrn  «  iteyee  9i  i 

Please  see  attached. 

SOL  87-6: 


The  Box  Method  for  Linear  Programming:  Part  I:  Basic  Theory 
Karel  Zikan  and  Richard  W.  Cottle 


^This  paper  presents  a  new  Interior-point  algorithm  for  linear  programming 
where  the  constraints  are  all  expressed  as  Inequalities.  Along  with  the 
concept  of  ♦"minimum-weight  basis"*,  the  algorithm  features  a  novel  mechanism 
for  finding  search  directions.  Unlike  other  Interior-point  methods  which 
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direction-finding  schemes,  the  one  reported  here  uses  ^boxes  .  The 

corresponding  subproblems  are  simple  linear  programs  having  closed  form 
solutions.  It  Is  shown  that  the  Iterates  generated  by  the  algorithm  converge 
to  an  extreme  point  of  the  feasible  region.  When  this  point  Is 

nondegenerate.  It  Is  optimal  and  reached  within  finitely  any  steps.  The 
methodology  Introduced  here  also  gives  rise  to  a  polyhedral  subdivision  of 
the  problem’s  feasible  region  and  In  fact  to  the  entire  space  of  decision 
variables. 


